Emergency Medicine Journal
● BMJ
Preprints posted in the last 30 days, ranked by how well they match Emergency Medicine Journal's content profile, based on 20 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.
Jayaprakash, A.; Liberati, E.; Lindsay, R.; Willars, J.; Gibson, J.; Fritz, Z.; Price, A.; Hatfield, T.; Richards, N.; Martin, G.
Objectives People with mental health conditions experience increased rates of diagnostic errors and delays in acute treatment. While causes such as diagnostic overshadowing (misattribution of physical symptoms to mental health conditions) are well documented, less attention has been paid to the organisational and structural conditions that shape diagnostic work. This study examines how physical illness is diagnosed in patients with mental health conditions in emergency departments (EDs), with a focus on the structural conditions that enable or constrain safe diagnostic practice. Method We conducted a multi-site ethnography across three purposively selected EDs in England between April 2023 and April 2024, varying in size, population demographics, and local service configuration. Data were collected through 284 hours of non-participant observation and 20 semi-structured interviews with ED staff. Results Our analysis identified four recurring structural gaps that shaped the conditions under which physical health diagnosis took place for patients with mental health conditions: a design gap, whereby targets and physical layouts constrained diagnostic reasoning; a preparedness gap, reflecting the lack of structural support to allow staff to act on their existing knowledge and skills; a coordination gap, reflecting fragmented ownership and the challenges of joint assessment across mental and physical healthcare teams; and an expectation gap, whereby unmet need elsewhere in the system increased demand for ED services that were beyond its formal scope. These gaps made diagnostic errors and delay more likely for patients with mental health conditions seeking physical healthcare in the ED. Conclusions As new dedicated mental health EDs are introduced in England, there is an opportunity to avoid reproducing these structural gaps in new settings. 
Our study suggests that improving physical healthcare for patients with mental health conditions requires changes to how EDs are designed, resourced and supported, and how they connect with the wider health and care system. Keywords: mental health, diagnostic inequality, emergency departments
Bokman, J. T.; Singapore PAROS Investigators, ; Ee, S.; Fook-Chong, S. M. C.; Binte Ahmad, N. S.; Leong, B. S.; Chia, M. Y. C.; Okada, Y.; Ong, M. E. H.; Siddiqui, F. J.
Background Bystander automated external defibrillator (BAED) use improves out-of-hospital cardiac arrest (OHCA) outcomes but remains uncommon globally. This study evaluated the outcomes of Singapore's 11-year public-access AED expansion and volunteer-responder implementation in terms of trends in BAED use, associated factors, and clinical outcomes. Methods This population-based, retrospective cohort study used Singapore Pan-Asian Resuscitation Outcomes Study (SG-PAROS) data (2010-2020) for adult, non-traumatic OHCAs. The primary outcome was bystander AED application. Multivariable logistic regression identified factors associated with use. Secondary outcomes included favorable neurological status (CPC 1-2), survival to discharge, and prehospital return of spontaneous circulation (ROSC). Results Of 21,439 included OHCA cases (median age 70.0 years; 63.8% male), BAED use increased from 1.7% to 9.6% over 11 years, with a corresponding increase in overall survival from 2.4 to 4.0%. Malay ethnicity (aOR 1.25, 1.06-1.49), calendar year (aOR 1.26, 1.22-1.29), and delayed emergency medical services (aOR 1.24, 1.06-1.45) were positive predictors of BAED use. Conversely, BAED use was lower among females (aOR 0.80, 95% CI 0.69-0.94), at night (aOR 0.69, 0.56-0.86), and in residential settings (aOR 0.06, 0.05-0.07). Volunteer arrival strongly increased application (aOR 4.16, 3.41-5.09), with a significant interaction (p<0.001); the effect was greater in residential (aOR 7.38, 5.81-9.38) than non-residential settings (aOR 1.71, 1.22-2.40). AED use predicted favorable neurological outcome (aOR 2.80, 2.24-3.50; NNT 8.7), survival (aOR 2.30, 1.89-2.80), and ROSC (aOR 2.11, 1.81-2.46). Conclusion Over 11 years, we saw a significant increase in BAED application and favorable neurological survival. This success was associated with the implementation of an integrated strategy combining widespread AED deployment, national training, and smartphone-activated volunteer responders. 
Singapore's experience provides a scalable model for urban centers seeking to expand their AED strategy.
Ries, M.; von der Forst, M.; Schaefer, H.; Bikowski, K.; Franzen, K.; Geoerg, P.; Weykamp, F.; Popp, E.; Kuellenberg, J.
Background: In crises, hospitals must rapidly shift from routine operations to structured crisis management, requiring the activation of an incident command system. However, empirical insight into their operational functioning during activation remains limited. Goal: to identify operational enablers and barriers influencing effective crisis response. Methods: Prospective cross-sectional, qualitative, single-center study conducted after a table-top exercise within a hospital incident command system at a tertiary care university hospital (NCT06913010). Data was collected through semi-structured interviews, participant observation, and document analysis, and analyzed using a narrative-phenomenological approach. Results: Nineteen participants were included. Analysis identified nine thematic clusters shaping operational performance: (1) structure and roles; (2) communication; (3) decision-making and prioritization; (4) information management; (5) infrastructure and technology; (6) personnel and organization; (7) training, exercises, and team dynamics; (8) documentation; and (9) external communication and media. Enablers included clear role definition, structured communication, phased decision-making, and regular training. Barriers included role ambiguity, fragmented communication, insufficient prioritization, infrastructure limitations, and staffing constraints. Conclusion: Preparedness frameworks are necessary but insufficient as stand-alone approaches, as operational execution determines real-world performance. Recurring deficits included unclear roles, inconsistent communication, weak prioritization, and gaps in infrastructure and personnel. 
A limited set of standardized practices - including a clear separation of roles, leadership intent, closed-loop communication, explicit decision cycles from information gathering to structuring to decision-making, checklists, visualization, central information management, and rapid "80% decisions" - substantially enhanced performance. Mission command (Auftragstaktik) further enabled adaptive, coordinated action. Strengthening hospital incident command is a key lever for achieving system-level resilience in crises.
Weber, K.; Stassen, W.; Jayaraman, S.; Odland, M. L.; Nishimwe, A.; Welgama, I.; Wallis, L.; Ignatowicz, A.; Davies, J. P.
Introduction -- Emergency Medical Dispatch Systems (EMDS) can reduce delays in accessing emergency care by providing structured communication, triage, and coordination. However, such systems remain absent or underdeveloped in most low- or middle-income countries (LMICs). This study aimed to establish international consensus on essential EMDS components to inform global guidance. Methods -- We convened a multidisciplinary expert group to draft a preliminary list of essential components for three EMDS levels reflecting resource availability and system maturity. We then conducted a three-round Delphi with international experts to reach consensus on core EMDS components. Components with ≥75% agreement were included; those with ≥75% disagreement were excluded. Components not achieving consensus by Round 3 were removed. Results were analysed overall and stratified by respondents' country income level. A subsequent online expert meeting resolved inconsistencies and finalised the component list. Results -- The expert group generated 111 components for each of three EMDS levels (Foundational, Emerging, and Established) spanning 11 operational domains. Of the 68 experts invited to the Delphi, 43 participated in Round 1 and 30 in Round 3. Across all Delphi rounds, 289 components reached consensus for inclusion. The consensus resulted in a final list of 227 components (63 Foundational, 84 Emerging, and 80 Established). Consensus agreement clustered around core EMDS domains including communication, structured call-taking and prioritisation, advice-giving, resource dispatch and tracking, and foundational governance and data functions, whereas items showing either non-consensus or consensus disagreement were typically technology-dependent or context-specific. Conclusions -- This international consensus offers guidance for EMDS development across diverse resource settings and provides a scalable roadmap to strengthen emergency care systems.
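The inclusion rule described in this abstract (≥75% agreement to include, ≥75% disagreement to exclude, otherwise no consensus) can be sketched in a few lines of Python. This is only an illustration of the stated thresholds: the function name, the vote counts, and the assumption that agreement is a simple proportion of respondents are hypothetical, since the abstract does not say how ratings were collapsed or how neutral responses were handled.

```python
from typing import Literal

Decision = Literal["include", "exclude", "no consensus"]


def classify_component(agree: int, disagree: int, total: int,
                       threshold: float = 0.75) -> Decision:
    """Classify one component in one Delphi round.

    Assumes agreement and disagreement are plain proportions of all
    respondents (an assumption; the abstract does not specify this).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if agree / total >= threshold:
        return "include"
    if disagree / total >= threshold:
        return "exclude"
    return "no consensus"


# Hypothetical vote counts for a panel of 30 (the Round 3 panel size).
print(classify_component(agree=23, disagree=4, total=30))  # include
print(classify_component(agree=22, disagree=5, total=30))  # no consensus
```

With a panel of 30, the threshold falls between 22 and 23 votes (22/30 is about 73%, 23/30 about 77%), which shows how a single respondent can move a component across the consensus line in later rounds with fewer participants.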
Soma, G.; Mercado, L.; Rayo, J.; Armstrong-Hough, M.; Bernstein, S. L.; Abroms, L.; Ngaruiya, C.
Abstract Background: Emergency Department (ED) populations are a high-risk group that is opportune for interventions targeting noncommunicable diseases (NCDs) and NCD risk factors, such as tobacco use. Mobile health (mHealth) interventions such as Text2Quit, a novel text message-based mHealth tool addressing tobacco cessation in the US, have demonstrated effectiveness for tobacco cessation and for ED-based mHealth interventions in High Income Countries (HIC). To successfully adapt and implement such mHealth interventions in limited resource settings like African EDs, it is essential to examine the implementation climate and engage key stakeholders. These implementers provide invaluable insight to understand healthcare system level factors that affect adoption, implementation and maintenance of the interventions. Methods: We conducted 12 semi-structured key informant interviews (KIIs) with ED administrators and staff including 2 departmental heads, 5 medical doctors, 3 nurses, and 2 clinical officers at a national referral hospital in Kenya. This was guided by RE-AIM framework indicators of Adoption, Implementation, and Maintenance (e.g., feasibility of intervention integration, and suggestions to improve implementation). Interviews were conducted in English, recorded, professionally transcribed and translated, and analyzed using a constant comparative analysis approach, according to grounded theory principles. Findings: Key informants were positive about the adoption of the mHealth intervention in Kenyan EDs and across different health facility levels in Kenya due to the perceived need for the program, facility and staff receptiveness and existing healthcare infrastructure to leverage. Recommended implementation strategies included follow-up mechanisms for program participants, inclusion of all healthcare cadres in implementation and increased sensitization and the use of champions. 
Barriers to implementation in the ED included competing clinical priorities with emergency cases, limited staffing and shame associated with smoking. Conclusion: Implementing a mobile health tobacco cessation program like Text2Quit is feasible and acceptable in Kenyan EDs when supported by targeted strategies.
Dworkis, D. A.; Stenstrom, J.; Sen, A.; Lucarelli, R. T.
Background: Stroke is a time-sensitive neurological emergency in which early EMS activation and presentation to definitive care are cornerstones of effective therapy. Large language models (LLMs) are increasingly consulted by the public for medical advice, but the veracity of the guidance provided by commercially available models responding to potential stroke symptoms is not well understood. Methods: We performed a cross-model benchmarking study comparing the triage choices of three frontier LLMs (Claude Sonnet 4.6, GPT-4o, and Llama 3.3-70b-versatile) on first-person vignettes describing a unilateral arm symptom on waking, across 10 symptom descriptors, and two clinical phases (before and after a partially reassuring self-examination), with or without a clinical distractor (n=50 per condition). Results: Claude sought emergency care most often, Llama least, and GPT-4o in between, diverging most sharply in the post-examination phase where Claude called 911 in 100% of runs, Llama called for non-emergency help in 100%, and GPT-4o was symptom-dependent. A distractor shifted behavior away from emergency care in almost all conditions: calling 911 fell from 37.9% to 14.6% and waiting rose from 0% to 45.9% in the post-examination vignette. Responses were also sensitive to symptom word: weak, limp, heavy, and clumsy generated higher alarm, whereas numb, tingly, odd, strange, and weird generated less urgent responses. Conclusions: The increasing use of LLMs for medical advice has significant public health implications. Commercially available LLMs show significant model-to-model variability and framing sensitivity when confronted with potential stroke symptoms, including under-recognition of canonical CDC warning descriptors, underscoring the need for systematic benchmarking as these tools become de facto first points of contact for patients experiencing neurological emergencies.
McCann, K. A.; Wright, D. S.; Iscoe, M. S.; Melnick, E. R.; Ohno-Machado, L.; Meeker, D.; Venkatesh, A. K.; Sangal, R. B.; Loza, A. J.
Importance: Abdominal pain causes roughly 10 million US emergency department (ED) visits annually, most resulting in discharge. Post-discharge courses vary, yet existing risk models predict only whether an ED revisit occurs, not what that revisit outcome will entail. Objective: To evaluate whether Curiosity, a generative medical event foundation model, can predict post-ED-discharge trajectories for adults with abdominal pain, differentiating the timing and severity of expected outcomes. Design: Retrospective cohort study; encounters January 1-December 31, 2022; 30-day follow-up; analysis conducted in 2026. Setting: Epic Cosmos research network (multicenter, population-based, de-identified electronic health record). Participants: Adults (≥18 years) discharged from the ED with abdominal pain, excluding training-set patients. Random sample of 3,000 drawn from 150,030 eligible patients (65.3% female; median age 47 years [IQR 36-60]). Exposure: ED discharge after evaluation for abdominal pain. Main Outcomes and Measures: Primary: Curiosity model vs. per-task, separately estimated XGBoost models on area under the receiver operating characteristic curve (AUROC) for ED revisit ending in admission (admit-revisit), ED revisit ending in discharge (DC-revisit), and any ED revisit at 72 hours, 7 days, and 30 days. Secondary: trajectory-level accuracy across 36 trajectory classes and edit distance vs XGBoost; calibration of simulated vs observed conditional path probabilities across 45 transitions. Results: Curiosity identified patients at high risk of revisit requiring admission more accurately than XGBoost and differentiated those likely to revisit without admission. Among 3,000 patients, Curiosity's 30-day admit-revisit AUROC was 0.83 (95% CI 0.79-0.87) vs 0.70 (95% CI 0.65-0.75) for XGBoost (DeLong P<.001), and admit-revisit AUC-PR was 0.37 (95% CI 0.29-0.46) against a 4.1% cohort base rate, vs XGBoost 0.13 (95% CI 0.09-0.19). 
Curiosity identified the most likely trajectory out of 36 possibilities for 45.9% of patients (XGBoost 41.0%; McNemar P<.001), with median edit distance 1.28 vs 1.40 (Wilcoxon P<.001). Median absolute calibration error across 45 transitions was 1.30 percentage points (95% CI 0.32-2.49). Conclusions and Relevance: A generative medical event foundation model produced calibrated trajectory-level predictions and discriminated admit-revisits more effectively than task-specific XGBoost baselines, separating patients that revisited and were admitted from those who revisited and were discharged.
Cussens, J.; Do, K.; Chambers, E. V.; Crum, A.; Burton, C.
Background High Intensity Use of urgent medical services by patients is widely recognised in urgent and emergency care. Studies of high intensity use of the emergency department have consistently shown features of complex systems behaviour in addition to highly heterogeneous individual patient characteristics. There have been no comparable studies of prehospital care use. Methods We examined the use of prehospital urgent and emergency services (NHS 111 and ambulance dispatch) using routinely collected data from a regional service in the UK (population 5 million). We used a complex systems perspective to examine (1) the distribution of contacts per individual; (2) the temporal stability of service use by individuals and at the whole-system level; and (3) the distribution of bursts of contacts. Results We analysed data from 847,555 individuals who contacted NHS 111 and 389,550 who contacted the ambulance dispatch service. 35,120 (4.2%) individuals who contacted NHS 111 had 5 or more contacts with the service over the two-year period and accounted for 290,625 (20.1%) of contacts. 16,755 (4.3%) individuals had 5 or more ambulance dispatch contact days and accounted for 169,085 (25.8%) of contacts. The distribution of contacts per individual showed a monotonic distribution between 5 and over 100 contacts that was heavy tailed and compatible with a power law distribution. At any level of use, patients with one or more mental health related contacts had a greater likelihood of further contact than those without. Conclusion Prehospital emergency service use shows multiple statistical features typical of a complex system. Interventions to manage demand need to consider both individual high intensity users (particularly in relation to their mental health) and the behaviour of the whole system.
he, b.; Cheng, S.-B.; Liu, M.; Li, M.
Background Complicated appendicitis (CA) increases morbidity and resource use.[1,2] In the emergency setting, risk stratification must rely on rapidly available data. Procalcitonin (PCT) is frequently obtained, but its incremental value beyond basic preoperative indicators remains uncertain.[5] We aimed to quantify PCT's incremental predictive value and develop a practical bedside score with temporal validation. Methods We conducted a retrospective cohort study of consecutive laparoscopic appendectomy patients (January 2023-December 2024). CA was defined by postoperative pathology (gangrene/necrosis, perforation, or peri-appendiceal inflammation/abscess; worst-category rule). We compared a base logistic model (age, WBC, neutrophil percentage, fever, symptom-to-surgery interval, shock index) with an extended model adding log-transformed PCT. Discrimination (AUC) and calibration were assessed. Temporal validation used 2023 for development and 2024 for testing. We also created a simple bedside score using pre-specified cutoffs and evaluated CA risk across score strata in 2024. Results In the overall complete-case cohort (n=1,792), 397 patients (22.2%) had CA. Adding PCT modestly improved discrimination in the full cohort (AUC 0.673 to 0.685). For temporal validation, 2023 included 870 patients (CA 26.9%) and 2024 included 921 patients (CA 17.7%); one otherwise eligible patient lacked a usable admission year. In the 2024 test set, discrimination was 0.662 (base) vs 0.673 (base+PCT) with a non-significant AUC difference (DeLong p=0.116); calibration slopes were near 1.0. A 7-item bedside score stratified 2024 CA risk: 9.1% (score 0-1), 14.7% (2-3), and 34.2% (≥4). Using ≥4 points identified a higher-risk subgroup (PPV 34.2%, NPV 87.5%, sensitivity 46.0%, specificity 81.0%). 
Conclusions PCT adds modest predictive information beyond simple preoperative indicators in the full cohort, but temporal validation suggests that this incremental gain is smaller and not statistically significant in later patients. A pragmatic bedside score can support CA risk stratification and prioritization in emergency care, whereas the role of routine PCT testing may be best reserved for selected situations in which uncertainty remains after initial assessment.
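The predictive values reported for the ≥4-point cutoff follow from the reported sensitivity, specificity, and 2024 CA prevalence through Bayes' rule. The short Python sketch below is not taken from the study; it is a consistency check that uses only the three figures quoted in the abstract (sensitivity 46.0%, specificity 81.0%, prevalence 17.7%), and the function name is an illustrative choice.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures quoted in the abstract for the 2024 test set, score >= 4.
ppv, npv = predictive_values(sensitivity=0.460, specificity=0.810,
                             prevalence=0.177)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # PPV 34.2%, NPV 87.5%
```

Because both predictive values depend on prevalence, the same score applied to the 2023 cohort (CA 26.9%) would be expected to give a higher PPV and a lower NPV, assuming sensitivity and specificity stayed the same.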
Healy, J.; Marvasti, A.; Wallace, D.; Baheerathan, A.; Ghosh, A.; Kossoff, J.; Thio, S.; Balaratnam, M.; Haider, S.; Ellershaw, S.; Dobson, R.
Background: Large language models (LLMs) demonstrate strong performance in controlled medical environments such as multiple choice exams, but their utility in real-world clinical workflows remains unproven. The NHS Advice & Guidance (A&G) service, where Primary Care clinicians can submit text-based queries to specialists, provides an environment for evaluating the clinical performance of LLMs as a specialist. Methods: We compared responses from MedGemma 4B-IT, an open-weight model deployed locally on hospital infrastructure, against specialist neurologist responses across 50 adult neurology A&G cases from University College London Hospital. Two neurologists and two GPs rated 80 blinded and 20 unblinded responses for outcome, safety, efficacy, and feasibility using standardised criteria; outcome was a binary correct/incorrect, while other domains were scored 1-5. Inter-rater reliability was assessed using intraclass correlation coefficients. Results: Although there were no statistically significant differences between blinded specialist neurologists and LLM responses across any domain (outcome: 84% vs 82%, p=0.67; safety: 3.98 vs 4.02, p=0.85; efficacy: 4.06 vs 3.98, p=0.61; feasibility: 4.39 vs 4.20, p=0.45), 10% of LLM responses received concerning scores (≤2 average score) compared to 0% of human responses, indicating potentially clinically important tail risk. Furthermore, unblinded results showed a preference for human responses, with human ratings being preferred across all domains. Only 51% of binary outcomes had unanimous agreement and inter-rater agreement was moderate across other domains (ICC 0.50-0.52). Conclusions: In this pilot study, aggregate scores between blinded human and LLM responses were similar, and no statistically significant differences were detected in this exploratory sample. However, aggregate metrics masked clinically important edge-case failures in LLM responses. 
Pronounced inter-rater variability and the potential impact of LLM/human syntax on blinded rater judgements highlight the challenges in establishing robust evaluation frameworks for clinical LLM deployment.
Creutzfeldt, C. J.; Leonhardt-Caprio, A.; Nielsen, E.; Lee, R. Y.; Wahlster, S.; Holloway, R. G.; Reinke, L. F.
Importance: Severe stroke is a leading cause of death and disability worldwide. Survivors and their families face long-term unmet needs, including care that does not reflect patients' values, fragmented care, and high rates of psychological distress among caregivers. Objective: To describe the conceptual framework of the longitudinal transdisciplinary neuropalliative care support (LOTUS) intervention and assess its fidelity in a pilot feasibility study. Design: Pilot feasibility randomized study; fidelity was assessed using weekly checklists completed by the LOTUS nurse and qualitative analysis of weekly LOTUS team meeting transcripts. Setting: Single comprehensive stroke center in Western New York. Participants: Patients hospitalized with severe stroke and their caregivers. Dyads were randomized to usual care or intervention. Intervention: The LOTUS intervention is implemented in a stepped-care fashion using 5 strategies: Awareness, Assistance, Adjustment, Acceptance and Alignment (5As). Led by a specially trained nurse with a chaplain, social worker, psychologist, and neuropalliative care physician, the LOTUS team follows dyads from early in the hospital course through 6 months. Main Outcomes and Measures: Fidelity, the degree to which the intervention was delivered as intended, assessed via (1) utilization of 5A activities from weekly LOTUS checklists; (2) thematic analysis of weekly LOTUS team meeting transcripts. Results: Of 26 patients in the trial, 13 were randomized to intervention. The LOTUS nurse completed 108 checklists, with an average of 619 minutes of direct contact per participant over 6 months. Each component of the 5A's was utilized. Awareness and Assistance predominated early after enrollment and revolved around personhood, support, and self-efficacy. Adjustment was especially relevant during care transitions and was typically supported by the LOTUS social worker. 
Acceptance and Alignment were more prevalent during later meetings, with the LOTUS psychologist supporting identification and modeling of coping skills and the LOTUS physician guiding prognosis and goals-of-care conversations. The LOTUS nurse served as primary point of contact, providing continuity and a trusting relationship, while other team members functioned in a predominantly advisory role. Conclusions: The LOTUS intervention was delivered with fidelity to the 5A-framework, supporting a future randomized clinical trial to evaluate its efficacy in patients with severe stroke and their caregivers.
Bunker, A. L.; Engelberg, R. A.; Holloway, R. G.; Creutzfeldt, C. J.
INTRODUCTION Severe acute brain injury (stroke, traumatic brain injury or hypoxic-ischemic encephalopathy; SABI) is increasingly recognized as a chronic condition with care and communication needs beyond the initial hospitalization. This study aimed to characterize post-acute care patterns among SABI survivors, focusing on healthcare utilization and outpatient communication. METHODS Data were collected from a prospective cohort of hospitalized SABI patients using surveys, chart reviews, and the ED Information Exchange database. Socioeconomic disadvantage was assessed using the Area Deprivation Index (ADI), and qualitative analysis of outpatient notes examined conversations around palliative care needs and goals-of-care. RESULTS Two-thirds of patients (140/222) survived until discharge, primarily to nursing facilities (39%) or inpatient rehabilitation (38%). Among 109 with one-year follow-up, there were 89 hospitalizations, 104 ED visits, and 28 deaths. Patients from the most disadvantaged neighborhoods had significantly higher odds of rehospitalization or ED use within 30 days (OR 3.37, p=0.036). ADI was not linked to one-year utilization. Patients were seen as outpatients by primary care (40%), neurology/neurosurgery (57%), and palliative care (1%), but conversations rarely revisited prognosis or goals-of-care. CONCLUSIONS Our findings highlight the need for improved long-term care planning and communication, particularly for socioeconomically disadvantaged survivors of SABI.
Trujillo-Vega, F.; Lopez-Delgado, P. A.
Abstract Background: Mean platelet volume (MPV) is a simple, low-cost biomarker that reflects platelet activation. Its prognostic value in septic shock remains controversial. We aimed to determine whether MPV at intensive care unit (ICU) admission is associated with hospital mortality in patients with septic shock. Methods: Retrospective cohort study of consecutive adults with septic shock (Sepsis-3 criteria) admitted to a single ICU. MPV, severity scores (SOFA, APACHE II, SAPS II), procalcitonin, and clinical data were collected. The primary outcome was in-hospital mortality. Spearman correlation, univariate and multivariate logistic regression (with Firth's correction), ROC curves, and subgroup analyses were performed. Results: Fifty-eight patients were included; mortality was 58.6%. MPV did not differ between non-survivors and survivors (13.09 ± 1.37 vs. 12.66 ± 1.45 fL, p = 0.259). MPV showed a weak correlation with procalcitonin (ρ = 0.394, p = 0.002) but not with severity scores. In multivariate analysis adjusting for age, sex, SOFA and comorbidity count, MPV was not an independent predictor of mortality (OR 1.075, 95% CI 0.682-1.755, p = 0.749). The area under the ROC curve for MPV was 0.598 (95% CI 0.444-0.752), significantly lower than that of SOFA (0.837) and procalcitonin (0.836). Subgroup analyses showed no significant association between MPV and mortality in any stratum. Conclusions: In this cohort of septic shock patients, MPV at ICU admission was not associated with hospital mortality and had poor discriminative ability. Widely used severity scores and procalcitonin remain superior prognostic markers. MPV should not be used as a prognostic tool in septic shock. Keywords: Septic shock, Mean platelet volume, Mortality, SOFA, Procalcitonin, Biomarker
Palmer, D. D. G.; Palmer, S.; Darracott, B.; Stone, K.
Introduction Functional neurological disorder (FND) is a common cause of neurological disability and is associated with substantial healthcare utilisation and cost. Most available treatments target specific symptom subtypes, and prospective evidence regarding the effect of treatment on health-system costs remains limited. We evaluated the real-world clinical and economic outcomes of a transdiagnostic outpatient intervention, attention-based rehabilitation (ABR). Methods We conducted a pragmatic waitlist-controlled study in 54 consecutively referred patients with neurologist-diagnosed FND attending a specialist outpatient service. Clinical outcomes--including quality of life (Short Form-36), social and occupational participation (Work and Social Adjustment Scale), symptom severity, and mental health (Hospital Anxiety and Depression Scale)--were assessed at waitlist entry, treatment commencement, treatment completion, and 6 and 12 months post-treatment. Healthcare utilisation and costs were obtained prospectively from health-service financial records for the 6 months preceding treatment, the treatment period, and two consecutive 6-month post-treatment periods. Longitudinal clinical outcomes and healthcare costs were analysed using Bayesian mixed-effects and mixture models, respectively. Results All clinical measures remained stable or worsened during the waitlist control period. Across treatment, six of eight SF-36 domains, WSAS, employment status, and both HADS subdomains improved, with maintenance through 12 months. Patient-reported symptom improvement persisted post-treatment. Expected monthly health system costs approximately halved post-treatment, with net cost savings by approximately 50 days. 
Conclusion A fixed-duration, symptom-agnostic outpatient ABR programme was associated with durable improvements in functioning and quality of life, alongside substantial reductions in healthcare utilisation and cost, supporting scalable symptom-agnostic treatment models for FND.
Hawkins, R. L.; Cotterill, C.; McCormick, S.; Kellar, I.; Lobo, A. J.; Sampson, F. C.
Background Unplanned hospital admissions in Inflammatory Bowel Diseases (IBD) account for nearly three-quarters of IBD inpatient stays in the United Kingdom. Although costly to services and distressing for patients, research exploring experiences and potential drivers of admissions is limited. We undertook a qualitative study to explore the healthcare experiences and access needs of people with IBD who had unplanned admissions, along with their caregivers and clinicians. Methods Semi-structured interviews with 25 participants from a single tertiary IBD service in England (17 people with IBD, 3 informal caregivers, 5 clinicians) were conducted. We applied thematic framework analysis, guided by the Candidacy Framework, and worked with 2 patient and public contributors to generate final themes. Results We identified four themes: 1) Difficulties in identifying flares and asserting severity before admission summarised the prevailing uncertainty in identifying a flare and accessing timely IBD care. 2) Navigating a disjointed healthcare system highlighted how lack of care plans and systemic barriers can delay access. 3) Emergency care access challenges highlighted the gaps in emergency and inpatient care during flares. 4) Fighting for care and individual advocacy needs described the persistent assertion for care that may disproportionately impact access for vulnerable groups, while also highlighting the importance of positive interpersonal relationships. Conclusions Individual, interpersonal and healthcare factors across the patient pathway were perceived to shape access to care in unplanned IBD admissions. Potentially reducing admissions requires proactive strategies, including the integration of patient education, monitoring tools, establishment of specialist rapid-access pathways, and formal psychological support to address barriers to access.
Preiksaitis, C. M.; Hughes, J.; Iscoe, M.; Makutonin, M.; Rider, A.; Melnick, E.; Rose, C.
Objectives: Electronic Health Records (EHRs) impose a significant time burden on physicians, often requiring work to be completed outside of scheduled hours. While this burden is well-documented, how it evolves throughout emergency medicine (EM) residency remains poorly understood. This study aimed to quantify EHR usage patterns, analyze the composition of after-shift work, and characterize the development of EHR efficiency across EM training. Methods: We conducted a retrospective cohort study of EM residents (postgraduate year [PGY] 1-4) using 5.5 years of EHR audit log data (2020-2025) at a single academic institution. We analyzed EHR time per new patient encounter, stratified by postgraduate year, and categorized activities into domains such as documentation, chart review, and orders. EHR work was measured both during and after scheduled shifts. Results: The analysis included 144 unique residents and 167,010 new patient encounters across 15,386 shifts. Encounter-attributed EHR time per encounter decreased by 52% from PGY-1 to PGY-4 (median 19.9 to 9.6 minutes, p<0.001), despite an 86% increase in patient volume per shift (median 7 to 13 encounters). This efficiency gain was driven primarily by a 69% reduction in documentation time (9.3 to 2.9 minutes), accompanied by shorter notes. After-shift work (EHR activity after the 9-hour clinical shift) was present in 89.9-94.4% of encounters. At the shift level, combined after-shift EHR time (encounter-attributed plus tracking board) was a median of 64.2 minutes per shift for PGY-1 and 104.2 minutes for PGY-4. Shift-level tracking board activity dominated the after-shift burden and increased with training (median 40.2 to 79.0 minutes per shift from PGY-1 to PGY-4). Conclusions: EM residents achieve substantial gains in on-shift EHR efficiency, with the largest reductions observed in documentation time, accompanied by shorter notes and faster input speed. 
However, a persistent after-hours workload, dominated by administrative and patient flow tasks, suggests that (at least at this single institution) system-level factors--not just individual skill--may contribute to this pattern. Monitoring these objective EHR metrics may help programs identify struggling learners and evaluate the impact of interventions aimed at improving resident well-being and workflow efficiency.
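The relative changes quoted in this abstract can be checked directly from the reported medians. The following is a minimal sketch of that arithmetic; the helper name `percent_change` is an illustrative assumption, and the only inputs are the PGY-1 and PGY-4 medians stated above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent (negative = decrease)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Medians quoted in the abstract (PGY-1 -> PGY-4).
ehr_time = percent_change(19.9, 9.6)       # EHR minutes per encounter
documentation = percent_change(9.3, 2.9)   # documentation minutes per encounter
volume = percent_change(7, 13)             # encounters per shift

print(round(ehr_time), round(documentation), round(volume))  # -52 -69 86
```

The rounded values match the 52% and 69% reductions and the 86% increase reported in the Results.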
Namian, S.; DiBiase, R.; Elnazer, S. H.; Evers, C.; Fung, C.; Narula, R.; Rafferty, M.; Salahuddin, A.; Sardana, D. J.; Shea, J.; Sullivan, M.; Forman, R.
Background: High school students may be able to communicate health topics to peers and adults. Yet, few studies have evaluated the role of high school students in community health initiatives, making them an underutilized group for disseminating health information. We pilot tested stroke education across five high schools using varied delivery approaches as a preliminary step toward evaluating youth stroke education to improve community health. Methods: In April-May 2025, five high schools in Connecticut and New York participated in stroke education. The format was designed to fit the needs of each school and included an 8-session classroom curriculum (Derby, CT), after-school club meetings (New Haven, CT; Long Island, NY), and one large assembly (Bridgeport, CT). Developed by teachers and neurology providers, the curriculum covered stroke risk factors, symptoms, and emergency response. Students completed a 15-point assessment adapted from the validated Stroke Action Test before, immediately after, and 4-6 weeks post-intervention; data were collected between April and July 2025. Results: Of 112 students completing the pre-test, 99 (88%) completed the immediate post-test and 51 (46%) the delayed follow-up. Average scores rose from 47% pre-intervention to 75% immediately post-intervention and 70% at 4-6 weeks. All schools scored <50% on pre-tests, suggesting poor baseline stroke knowledge. Conclusion: This pilot suggests that stroke education can be delivered to high school students across varied settings and may support knowledge gains up to 6 weeks. Limitations included small sample sizes and missing follow-up data. If validated in larger studies, this adaptable, teacher-supported approach could offer a scalable public health strategy for improving community stroke preparedness.
Charfeddine, N.; Schranz, M.; Schlump, C.; Rupprecht, M.; Ullrich, A.; Diercke, M.; AKTIN Research Group; Estupinan Mendez, J.
Background: Mass gathering events (MGEs) are associated with several public health challenges and may strain healthcare services. Literature findings on the impact of MGEs on emergency departments (EDs) are heterogeneous. Objectives: To examine shifts in ED attendance characteristics during a major sporting tournament, namely the UEFA European Football Championship 2024 held in Germany. Methods: We conducted a retrospective observational study using ED data from the Emergency Department Data Registry. We compared baseline ED attendance characteristics between the tournament and the reference period, defined as two weeks before and two weeks after the tournament, and between Germany game days and non-Germany game days. Hourly attendance patterns were analysed for all Germany games using a reference range. Results: We included data from 41 EDs, totalling 253,493 attendances during the study period. A 1.57% increase in attendance was observed during the tournament compared to the reference period, with baseline characteristics remaining similar. The median daily attendance across all EDs was slightly lower on Germany game days (4066) than on non-Germany game days (4128). Modest changes were observed in hourly attendance on Germany game days, most notably during the last Germany game, when attendance fell below the reference range for three hours. Conclusions: The observed shifts in ED attendance were minimal, suggesting that no major changes of public health relevance occurred in ED attendance during the tournament. We highlight the utility of using ED data for monitoring and for enhancing the understanding of the public health risks and challenges associated with MGEs.
Benning, L.; Hirsch, A.; Groeschel, M.; Roeschl, T.; Spott, M.; Hans, F. P.; Urban, T.; Busch, H.-J.; Meyer, A.; Madrid, J.
Background Emergency department (ED) triage is a high-stakes clinical decision process that determines patient prioritization and resource allocation under time pressure. Large language models (LLMs) have recently been proposed as decision-support tools for triage, yet most evaluations rely on simulated scenarios or curated datasets. Evidence from real-world clinical environments remains limited. The objective of this project was to systematically evaluate the performance, calibration, and reproducibility of multiple contemporary LLMs for Emergency Severity Index (ESI) classification and sectoral allocation (ED vs. urgent care practice, UCP) using a comprehensive real-world triage dataset. Material and Methods This was a retrospective cross-sectional benchmarking study conducted at a tertiary academic ED in Germany with an integrated central point of assessment (CPA). The study included all consecutive adult walk-in encounters (>18 years) presenting between October 2023 and February 2024 (N = 16,107). Data were collected from a structured clinical decision support system capturing presenting complaints, vital signs, and triage decisions recorded by specialized nursing staff. The variables comprised structured clinical data routinely collected at triage, including presenting complaint categories (CEDIS-PCL), vital signs according to the ABCDE framework, and additional structured or free-text clinical information. Results The primary outcome was the agreement between LLM-predicted and nurse-assigned ESI levels, measured using quadratic-weighted Cohen's κ. Secondary outcomes included sectoral assignment agreement, misclassification patterns (over- and under-triage), calibration metrics, and output reproducibility. Quadratic-weighted κ values ranged from 0.18 to 0.75 across models. Only a structured stepwise prompting strategy achieved substantial agreement (κ_qw = 0.747), approaching reported human inter-rater reliability. 
Most models demonstrated moderate or lower agreement and systematic overconfidence, with expected calibration errors (ECE) based on verbalized confidence ranging from 0.099 to 0.355. Sectoral assignment agreement (i.e. ED vs. UCP) was uniformly low (κ < 0.30). Reproducibility testing revealed substantial variability in 23% of cases, indicating non-deterministic output behavior for clinically relevant decisions. Conclusions Current LLMs demonstrate heterogeneous and generally limited performance in real-world emergency triage tasks. Structured algorithm-guided prompting appears more influential than model architecture or size. Before clinical implementation, improvements in calibration, reliability, and workflow integration are required, alongside regulatory-compliant validation in prospective clinical settings.
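For readers unfamiliar with the primary agreement metric in this abstract, the following is a minimal sketch of quadratic-weighted Cohen's kappa for ordinal labels such as ESI levels 1 to 5. The function name and the toy ratings are illustrative assumptions; this is not the study's code or data.

```python
def quadratic_weighted_kappa(rater_a, rater_b, levels):
    """Quadratic-weighted Cohen's kappa for two equal-length lists of ordinal labels.

    `levels` lists the ordered categories (e.g. [1, 2, 3, 4, 5] for ESI).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(levels)
    if k < 2:
        raise ValueError("at least two ordinal levels are required")
    n = len(rater_a)
    index = {level: i for i, level in enumerate(levels)}

    # Observed joint counts: rows = rater A, columns = rater B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the squared distance between levels.
            weight = ((i - j) ** 2) / ((k - 1) ** 2)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        # Both raters used a single identical level; kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


# Toy example (made-up ratings): nurse vs. model ESI levels.
nurse = [1, 2, 3, 3, 4, 5, 2, 3]
model = [1, 2, 3, 4, 4, 5, 3, 3]
print(quadratic_weighted_kappa(nurse, model, levels=[1, 2, 3, 4, 5]))
```

Because the weights are quadratic, a prediction two ESI levels away from the nurse's rating is penalised four times as heavily as one a single level away, which is why this metric is commonly preferred for ordinal triage scales.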
Tambo, J. M.
Background The emergency department (ED) serves as a critical entry point into hospital care and a sentinel indicator of health system performance. In-hospital mortality within 48 hours of ED admission represents acute care failures that are often preventable yet remain poorly characterized in sub-Saharan African (SSA) settings. This study aimed to identify the demographic, clinical, and hospital-related determinants of in-hospital mortality within 48 hours of admission to the Emergency and Urgent Care Department at the University Teaching Hospital (UTH), Lusaka, Zambia. Methods A retrospective cross-sectional study was conducted using 385 patient records from UTH's Emergency and Urgent Care Department for the year 2021. Data were extracted from the District Health Information System 2 (DHIS2) using simple random sampling. Descriptive statistics, univariate, and multivariable logistic regression analyses were performed using STATA 16.1 MP. Variables with p<0.20 in univariate analysis were retained for adjusted modelling. Multicollinearity was assessed via variance inflation factors (VIF <5). Model fit was evaluated using the Hosmer-Lemeshow goodness-of-fit test and receiver operating characteristic (ROC) analysis. Results Of 385 patients, 175 (45.5%) died within 48 hours of admission. Patients who died were older (median age 45 vs. 37.5 years, p<0.001). In multivariable analysis, three variables were independently associated with 48-hour mortality: pulse rate (aOR = 0.98, 95% CI: 0.95-1.00, p = 0.036), Glasgow Coma Scale (GCS) score (aOR = 0.75, 95% CI: 0.63-0.90, p = 0.002), and out-of-hours admission between 00:00-07:59 (aOR = 11.44, 95% CI: 1.19-109.96, p = 0.035). Age was a significant predictor in univariate analysis but not in the adjusted model, indicating confounding. The model demonstrated good discriminatory ability (AUC = 0.81). 
Conclusions Reduced pulse rate, lower GCS score at admission, and out-of-hours presentation are independent determinants of 48-hour in-hospital mortality at UTH. These findings underscore the need for enhanced vital sign monitoring protocols, targeted staffing during overnight hours, and improved risk stratification tools in resource-constrained emergency care settings. The wide confidence interval for the time-of-admission finding warrants cautious interpretation and validation in future prospective studies.
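To illustrate why the interval for out-of-hours admission is described as wide, the sketch below recovers the implied standard error of the log odds ratio from the reported 95% CI. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the helper name `log_or_summary` is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_summary(ci_lower, ci_upper):
    """Return (implied point estimate, standard error of log OR) from a 95% CI.

    Assumes the interval is symmetric on the log-odds scale (Wald interval).
    """
    if ci_lower <= 0 or ci_upper <= ci_lower:
        raise ValueError("expected 0 < ci_lower < ci_upper")
    log_lower, log_upper = math.log(ci_lower), math.log(ci_upper)
    point = math.exp((log_lower + log_upper) / 2)   # geometric midpoint of the CI
    std_err = (log_upper - log_lower) / (2 * Z_95)  # half-width divided by z
    return point, std_err


# Reported interval for out-of-hours admission (aOR = 11.44).
point, std_err = log_or_summary(1.19, 109.96)
print(f"implied aOR ~ {point:.2f}, SE(log aOR) ~ {std_err:.2f}")
```

Under that assumption the geometric midpoint of the interval agrees with the reported aOR of 11.44, and the implied standard error exceeds 1 on the log scale, so the upper bound is roughly 90 times the lower bound.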